18 research outputs found

WW1LOD: an application of CIDOC-CRM to World War 1 linked data

Author: Eero Hyvönen
Eetu Mäkelä
Juha Törnroos
Thea Lindquist
Publication venue: 'Modern Language Association'
Publication date: 01/01/2017

The CIDOC-CRM standard indicates that common events, actors, places and timeframes are important in linking together cultural material, and provides a framework for describing them. However, merely describing entities in this way in two datasets does not yet interlink them. To do that, the identities of instances still need to be either reconciled, or be based on a shared vocabulary. The WW1LOD dataset presented in this paper was created to facilitate both of these approaches for collections dealing with the First World War. For this purpose, the dataset includes events, places, agents, times, keywords, and themes related to the war, based on over ten different authoritative data sources from providers such as the Imperial War Museum. The content is harmonized into RDF, and published as a Linked Open Data service. While generally based on CIDOC-CRM, some modeling choices used also deviate from it where our experience dictated such. In the article, these deviations are discussed in the hope that they may serve as examples where CIDOC-CRM itself may warrant further examination. As a demonstration of use, the dataset and online service have been used to create a contextual reader application that is able to link together and pull in information related to WW1 from, e.g., 1914–1918 Online, Wikipedia, WW1 Discovery, Europeana and the Digital Public Library of America.

Humanities Commons

Beacon v2 and Beacon networks: A "lingua franca" for federated data discovery in biomedical genomics, and beyond

Author: Ariosa Roberto
Baudis Michael
Beck Tim
Brookes Anthony J
Fromont Lauren A
Navarro Arcadi
Paloots Rahel
Rambla Jordi
Rueda Manuel
Saunders Gary
Singh Babita
Spalding John D
Törnroos Juha
Vasallo Claudia
Veal Colin D
Publication venue: 'Wiley'
Publication date: 17/03/2022

Beacon is a basic data discovery protocol issued by the Global Alliance for Genomics and Health (GA4GH). The main goal addressed by version 1 of the Beacon protocol was to test the feasibility of broadly sharing human genomic data, through providing simple "yes" or "no" responses to queries about the presence of a given variant in datasets hosted by Beacon providers. The popularity of this concept has fostered the design of version 2, which better serves real-world requirements and addresses the needs of clinical genomics research and healthcare, as assessed by several contributing projects and organizations. In particular, rare disease genetics and cancer research will benefit from new case-level and genomic-variant-level requests and the enabling of richer phenotype and clinical queries as well as support for fuzzy searches. Beacon is designed as a "lingua franca" to bridge data collections hosted in software solutions with different and rich interfaces. Beacon version 2 works alongside popular standards like Phenopackets, OMOP, or FHIR, allowing implementing consortia to return matches in beacon responses and provide a handover to their preferred data exchange format. The protocol is being explored by other research domains and is being tested in several international projects.
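
The version 1 idea of answering only "yes" or "no" to a variant-presence query can be illustrated with a small in-memory sketch. This is not the GA4GH Beacon API, which is a web protocol with its own request and response schemas; the `Variant` and `ToyBeacon` names and the example coordinates are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A hypothetical, simplified variant key (not the Beacon schema)."""
    chromosome: str
    position: int
    reference: str
    alternate: str


class ToyBeacon:
    """Holds a private set of variants and reveals only presence or absence."""

    def __init__(self, variants):
        self._variants = set(variants)

    def has_variant(self, query: Variant) -> bool:
        # The caller learns a single boolean, never the underlying records.
        return query in self._variants


# Made-up coordinates, used only to exercise the sketch.
beacon = ToyBeacon([Variant("17", 1000, "G", "A")])
present = beacon.has_variant(Variant("17", 1000, "G", "A"))
absent = beacon.has_variant(Variant("17", 1000, "G", "T"))
print(present, absent)
```

Returning a bare boolean keeps the individual-level data on the provider's side, which is the property the abstract attributes to version 1; the richer case-level and phenotype queries of version 2 are outside the scope of this sketch.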

ZORA

GA4GH: International policies and standards for data sharing across genomic research and healthcare.

Author: Adams Jeremy B
Alterovitz Gil
Auvil Jaime M Guidry
Babb Lawrence J
Barkley Maxmillian P
Baudis Michael
Beauvais Michael JS
Beck Tim
Beckmann Jacques S
Beltran Sergi
Bernick David
Bernier Alexander
Birney Ewan
Bonfield James K
Boughtwood Tiffany F
Bourque Guillaume
Bowers Sarion R
Brookes Anthony J
Brudno Michael
Brush Matthew H
Bujold David
Burdett Tony
Buske Orion J
Cabili Moran N
Cameron Daniel L
Carroll Robert J
Casas-Silva Esmeralda
Chakravarty Debyani
Chaudhari Bimal P
Chen Shu Hui
Cherry J Michael
Chung Justina
Cline Melissa
Clissold Hayley L
Cook-Deegan Robert M
Courtot Mélanie
Cunningham Fiona
Cupak Miro
Davies Robert M
Denisko Danielle
Doerr Megan J
Dolman Lena I
Dove Edward S
Dursi L Jonathan
Dyke Stephanie OM
Eddy James A
Eilbeck Karen
Ellrott Kyle P
Fairley Susan
Fakhro Khalid A
Firth Helen V
Fitzsimons Michael S
Fiume Marc
Flicek Paul
Fore Ian M
Freeberg Mallory A
Freimuth Robert R
Fromont Lauren A
Fuerth Jonathan
Gaff Clara L
Gan Weiniu
Ghanaim Elena M
Glazer David
Goodhand Peter
Green Robert C
Griffith Malachi
Griffith Obi L
Grossman Robert L
Groza Tudor
Guigó Roderic
Guimera Roman Valls
Gupta Dipayan
Haendel Melissa A
Hamosh Ada
Hansen David P
Hart Reece K
Hartley Dean Mitchell
Haussler David
Hendricks-Sturrup Rachele M
Ho Calvin WL
Hobb Ashley E
Hoffman Michael M
Hofmann Oliver M
Holub Petr
Hsu Jacob Shujui
Hubaux Jean-Pierre
Hunt Sarah E
Husami Ammar
Jacobsen Julius O
Jamuar Saumya S
Janes Elizabeth L
Jeanson Francis
Jené Aina
Johns Amber L
Joly Yann
Jones Steven JM
Kanitz Alexander
Kato Kazuto
Keane Thomas M
Kekesi-Lafrance Kristina
Kelleher Jerome
Kerry Giselle
Khor Seik-Soon
Knoppers Bartha M
Konopko Melissa A
Kosaki Kenjiro
Kuba Martin
Lawson Jonathan
Leinonen Rasko
Li Stephanie
Lin Michael F
Linden Mikael
Liu Xianglin
Lopez Javier
Lucassen Anneke M
Lukowski Michael
Mann Alice L
Marshall John
Mattioni Michele
Metke-Jimenez Alejandro
Middleton Anna
Milne Richard J
Molnár-Gábor Fruzsina
Mulder Nicola
Munoz-Torres Monica C
Nag Rishi
Nakagawa Hidewaki
Nasir Jamal
Navarro Arcadi
Nelson Tristan H
Niewielska Ania
Nisselle Amy
Niu Jeffrey
North Kathryn
Nyrönen Tommi H
O'Connor Brian D
Oesterle Sabine
Ogishima Soichi
Page Angela JH
Paglione Laura AD
Palumbo Emilio
Parkinson Helen E
Philippakis Anthony A
Pizarro Angel D
Prlic Andreas
Rambla Jordi
Rehm Heidi L
Rendon Augusto
Rider Renee A
Robinson Peter N
Rodarmer Kurt W
Rodriguez Laura Lyman
Rubin Alan F
Rueda Manuel
Rushton Gregory A
Ryan Rosalyn S
Saunders Gary I
Schuilenburg Helen
Schwede Torsten
Scollen Serena
Senf Alexander
Sheffield Nathan C
Skantharajah Neerjah
Smith Albert V
Smith Lindsay
Sofia Heidi J
Spalding Dylan
Spurdle Amanda B
Stark Zornitza
Stein Lincoln D
Suematsu Makoto
Tan Patrick
Tedds Jonathan A
Thomson Alastair A
Thorogood Adrian
Tickle Timothy L
Tokunaga Katsushi
Torrents David
Törnroos Juha
Udara Liyanage Isuru
Upchurch Sean
Valencia Alfonso
Vamathevan Jessica
Varma Susheel
Vears Danya F
Viner Coby
Voisin Craig
Wagner Alex H
Wallace Susan E
Walsh Brian P
Wang Vivian Ota
Williams Marc S
Winkler Eva C
Wold Barbara J
Wood Grant M
Woolley J Patrick
Yamasaki Chisato
Yates Andrew D
Yung Christina K
Zass Lyndon J
Zaytseva Ksenia
Zhang Junjun
Publication venue: Cell Genom
Publication date: 01/01/2021

The Global Alliance for Genomics and Health (GA4GH) aims to accelerate biomedical advances by enabling the responsible sharing of clinical and genomic data through both harmonized data aggregation and federated approaches. The decreasing cost of genomic sequencing (along with other genome-wide molecular assays) and increasing evidence of its clinical utility will soon drive the generation of sequence data from tens of millions of humans, with increasing levels of diversity. In this perspective, we present the GA4GH strategies for addressing the major challenges of this data revolution. We describe the GA4GH organization, which is fueled by the development efforts of eight Work Streams and informed by the needs of 24 Driver Projects and other key stakeholders. We present the GA4GH suite of secure, interoperable technical standards and policy frameworks and review the current status of standards, their relevance to key domains of research and clinical care, and future plans of GA4GH. Broad international participation in building, adopting, and deploying GA4GH standards and frameworks will catalyze an unprecedented effort in data sharing that will be critical to advancing genomic medicine and ensuring that all populations can access its benefits.

The Jackson Laboratory: The Mouseion at the JAXlibrary

edoc

University of Northampton's Research Explorer

PubMed Central

Edinburgh Research Explorer

Ontology-based event recognition using latent semantic analysis (Ontologiaperusteisten tapahtumien tunnistus piilevän semantiikan analyysillä)

Author: Törnroos Juha
Publication venue: Helsingin yliopisto
Publication date: 01/01/2012

Traditional text search compares character strings found in the text with one another, so that, for example, the search term 'Nokia' may return documents about the mobile phone manufacturer, the town of Nokia, or the protagonist of F. E. Sillanpää's work Ihmiset suviyössä. This thesis presents a method for information retrieval (IR) that makes it possible to search for text documents using a precisely defined concept. A precisely defined concept means a concept defined in an ontology, that is, a machine-understandable vocabulary. The thesis focuses in particular on events defined in a history ontology. The method presented aims to recognize the concepts occurring in a document on the basis of the semantics surrounding the words. More precisely, the semantics surrounding a word are obtained from a so-called semantic space, which is constructed with a mathematical method called latent semantic analysis (LSA), and the surrounding semantics are applied to ontological query expansion. An attempt was made to evaluate the model with an experimental setup that used the Finnish History Ontology (Suomalainen historiaontologia) and articles from the Finnish-language Wikipedia as material. Owing to difficulties that arose in the experimental setup, the evaluation of the model's performance remained incomplete. The end of the thesis discusses the significance of the method for information retrieval in general, since the method described here, which maps concepts defined in an ontology into the semantic space determined by text documents, is new, and no earlier research has been done on its operation or development.
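
The general idea of placing a concept and a set of documents in one latent semantic space can be sketched as follows. This is a generic textbook-style LSA example built on a truncated singular value decomposition, not the method or data of the thesis; the toy vocabulary, the term-document counts, and the function names are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical toy vocabulary and term-document count matrix.
# Rows are terms, columns are documents 0..3.
TERMS = ["nokia", "phone", "mobile", "town", "river"]
TERM_DOC = np.array([
    [1, 0, 1, 0],  # nokia appears in a phone document and a town document
    [1, 1, 0, 0],  # phone
    [1, 1, 0, 0],  # mobile
    [0, 0, 1, 1],  # town
    [0, 0, 1, 1],  # river
], dtype=float)


def build_semantic_space(term_doc, k):
    """Truncated SVD: keep the k strongest latent dimensions."""
    u, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    return u[:, :k], s[:k], vt[:k, :].T  # document vectors as rows


def fold_in(term_vector, u_k, s_k):
    """Project a bag of terms into the space of the document vectors."""
    return (term_vector @ u_k) / s_k


def cosine_scores(query_vec, doc_vecs):
    """Cosine similarity between one query vector and each document vector."""
    norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec)
    return (doc_vecs @ query_vec) / norms


u_k, s_k, doc_vecs = build_semantic_space(TERM_DOC, k=2)

# A concept is represented here simply by the terms of its label.
concept = np.array([1.0 if t in {"phone", "mobile"} else 0.0 for t in TERMS])
scores = cosine_scores(fold_in(concept, u_k, s_k), doc_vecs)
ranking = [int(i) for i in np.argsort(-scores)]
print(ranking)
```

In this toy setup the two phone-related documents score above the two town-related ones even though 'nokia' occurs on both sides, which is the kind of disambiguation by surrounding terms that the abstract describes.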

Helsingin yliopiston digitaalinen arkisto


WW1LOD

Author: Hyvönen Eero
Lindquist Thea
Mäkelä Eetu
Törnroos Juha
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/11/2017

The CIDOC-CRM standard indicates that common events, actors, places and timeframes are important in linking together cultural material, and provides a framework for describing them. However, merely describing entities in this way in two datasets does not yet interlink them. To do that, the identities of instances still need to be either reconciled, or be based on a shared vocabulary. The WW1LOD dataset presented in this paper was created to facilitate both of these approaches for collections dealing with the First World War. For this purpose, the dataset includes events, places, agents, times, keywords, and themes related to the war, based on over ten different authoritative data sources from providers such as the Imperial War Museum. The content is harmonized into RDF, and published as a Linked Open Data service. While generally based on CIDOC-CRM, some modeling choices used also deviate from it where our experience dictated such. In the article, these deviations are discussed in the hope that they may serve as examples where CIDOC-CRM itself may warrant further examination. As a demonstration of use, the dataset and online service have been used to create a contextual reader application that is able to link together and pull in information related to WW1 from, e.g., 1914–1918 Online, Wikipedia, WW1 Discovery, Europeana and the Digital Public Library of America. Peer reviewed.

CU Scholar Institutional Repository

Aaltodoc Publication Archive

Modular Pre-Ingest Tool for Diverse Needs of Producers

Author: Koivunen Kimmo
Lehtonen Kuisma
Somerkoski Pauliina
Törnroos Juha
Vatanen Mikko

We introduce an open-source pre-ingest tool that assists in generating Submission Information Packages (SIPs) that are to be submitted to the national digital preservation service in Finland. The pre-ingest tool consists of several independent components that produce the parts of a METS document required by the national preservation service. These components are easy to modify when developing services for different user demands or for different repositories. Users of the tool provide the necessary information as parameters for the tool, which produces the structure and descriptions for the SIP. The pre-ingest tool reduces the need for a deep understanding of METS, PREMIS, or other metadata formats in order to preserve digital assets.
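
The notion of independent components that each produce one part of a METS document from user-supplied parameters can be sketched with the Python standard library. This is not the tool described above and does not follow the Finnish national preservation service's specifications; the function names, the object identifier, and the file paths are hypothetical, and only a file section and a structural map are produced (no PREMIS or other administrative metadata).

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def q(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_file_section(paths):
    """Component 1: list the payload files in a fileSec."""
    file_sec = ET.Element(q("fileSec"))
    group = ET.SubElement(file_sec, q("fileGrp"))
    for index, path in enumerate(paths, start=1):
        entry = ET.SubElement(group, q("file"), {"ID": f"file-{index}"})
        ET.SubElement(entry, q("FLocat"), {
            "LOCTYPE": "URL",
            f"{{{XLINK_NS}}}href": f"file:///{path}",
        })
    return file_sec


def build_struct_map(file_sec):
    """Component 2: a flat structural map pointing at every listed file."""
    struct_map = ET.Element(q("structMap"))
    div = ET.SubElement(struct_map, q("div"), {"TYPE": "package"})
    for entry in file_sec.iter(q("file")):
        ET.SubElement(div, q("fptr"), {"FILEID": entry.get("ID")})
    return struct_map


def build_mets(object_id, paths):
    """Combine the independent parts into one METS root element."""
    root = ET.Element(q("mets"), {"OBJID": object_id})
    file_sec = build_file_section(paths)
    root.append(file_sec)
    root.append(build_struct_map(file_sec))
    return root


# Hypothetical parameters a user might supply.
mets = build_mets("example-sip-001", ["data/report.pdf", "data/image.tif"])
xml_text = ET.tostring(mets, encoding="unicode")
print(xml_text)
```

Keeping each section in its own function mirrors the modularity the abstract emphasizes: a component can be replaced or adapted for a different repository without touching the others.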

Permanent Hosting, Archiving and Indexing of Digital Resources and Assets

IOS Press World War 1 as Linked Open Data

Author: Eero Hyvönen
Eetu Mäkelä
Juha Törnroos
Thea Lindquist

The WW1LOD dataset is primarily a reference dataset meant to bind together collections dealing with the First World War. For this purpose, the dataset gathers events, places and agents related to the war from various authoritative sources. These are then made available for indexing and other use through a variety of interfaces and APIs. Additional information on the entities is also collected, in order to be able to answer more complex questions relating to them. The approach is being evaluated using a concrete WW1 online collection.

CiteSeerX


Publisher Correction: Federated discovery and sharing of genomic data using Beacons.

Author: Baudis Michael
Brookes Anthony J
Carey Knox
Cupak Miroslav
de la Torre Sabela
Dolman Lena
Dyke Stephanie OM
Fiume Marc
Flicek Paul
Goodhand Peter
Haeussler Maximilian
Haussler David
Keenan Stephen
Lappalainen Ilkka
Linden Mikael
Lloyd David
Page Angela
Rambla Jordi
Saunders Gary
Scollen Serena
Sherry Stephen
Spalding J Dylan
Stockinger Heinz
Törnroos Juha
Ur-Rehman Saif
Varma Susheel
Publication venue: eScholarship, University of California
Publication date: 01/04/2019

In the version of this article initially published, Lena Dolman's second affiliation was given as Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK. The correct second affiliation is Ontario Institute for Cancer Research, Toronto, Ontario, Canada. The error has been corrected in the HTML and PDF versions of the article.

eScholarship - University of California